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Abstract 



We investigate mathematically a nonlinear approximation type approach recently 
introduced in [1] to solve high dimensional partial differential equations. We show 
I the link between the approach and the greedy algorithms of approximation theory 

Tij" ■ studied e.g. in [4j. On the prototypical case of the Poisson equation, we show that 

I a variational version of the approach, based on minimization of energies, converges. 

On the other hand, we show various theoretical and numerical difficulties arising with 
the non variational version of the approach, consisting of simply solving the first order 
QP ■ optimality equations of the problem. Several unsolved issues are indicated in order to 

' motivate further research. 

> ■ 

^ ! 1 Introduction 

Our purpose here is to investigate mathematically a numerical approach recently intro- 
duced in [1] to solve high dimensional partial differential equations. 

The approach is a nonlinear approximation type approach that consists in expanding 
the solution of the equation in tensor products of functions sequentially determined as the 
iterations of the algorithm proceed. The original motivation of the approach is the wish of 
its authors to solve high-dimensional Fokker-Planck type equations arising in the modelling 
of complex fluids. Reportedly, the approach performs well in this case, and, in addition, 
extends to a large variety of partial differential equations, static or time-dependent, linear 
or nonlinear, elliptic or parabolic, involving self-adjoint or non self- adjoint operators pro- 
vided the data enjoy some appropriate separation property with respect to the different 
coordinates (this property is made precise in Remark [T] below). We refer the reader to [1] 
for more details. 

In the present contribution focused on mathematical analysis, we restrict ourselves to 
the simplest possible case, namely the solution of the Poisson equation set with Dirichlet 
homogeneous boundary conditions on a two dimensional parallelepipedic domain Q = 
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0,^ X 0,y with 17^ C M and 17^^ C M bounded. In short, the approach under consideration 
then determines the solution u to 

- Au{x,y) = f{x,y) (1) 

as a sum 

U{x, y) = ^ rn{x) Sn{y), (2) 
n>l 

by iteratively determining functions r„(x), Sn{y), n > 1 such that for all n, rn{x) Sn{y) 
is the best approximation (in a sense to be made precise below) of the solution f (x, y) 

to —Av{x,y) = f{x,y) + A (Ylk<n-i'''kix) Sk{y)j in terms of one single tensor product 
r(x)s{y). We show that it is possible to give a sound mathematical ground to the approach 
provided we consider a variational form of the approach that manipulates minimizers of 
energies instead of solutions to equations. In order to reformulate the approach in such a 
variational setting, our arguments thus crucially exploit the fact that the Laplace operator 
is self-adjoint. It is to be already emphasized that, because of the nonlinearity of the 
tensor product expansion jS]), the variational form of the approach is not equivalent to 
the form ([II)-© (which is exactly the Euler-Lagrange equations associated to the energy 
considered in the variational approach). Our analysis therefore does not apply to the 
actual implementation of the method as described in flj. At present time, we do not know 
how to extend our arguments to cover the practical situation, even in the simple case of 
the Poisson problem. The consideration of some particular pathological cases, theoretically 
and numerically, shows that the appropriate mathematical setting is unclear. Likewise, it is 
unclear to us how to provide a mathematical foundation of the approach for non variational 
situations, such as an equation involving a differential operator that is not self-adjoint. 

On the other hand, the analysis provided here straightforwardly extends to the case of 
a A'^-dimensional Poisson problem with > 3 (unless explicitly mentioned). Likewise, our 
analysis extends to the case of elliptic linear partial differential equations set on a cylinder in 
M^, with appropriate boundary conditions. The only, although substantial, difficulty that 
may appear when the dimension N grows is the algorithmic complexity of the approach, 
since a set of coupled non- linear equations has to be solved (see Remark [2|). At least, the 
number of unknowns involved in the systems to be solved does not grow exponentially, as it 
would be the case for a naive approach (like for a finite differences method on a tensorized 
grid). This is not the purpose of the present article to further elaborate on this. 

Our article is organized as follows. Section [2] introduces the approach. The variational 
version of the approach (along with a relaxed variant of it) is described in Section [2m Ele- 
mentary properties follow in Sections 12.21 and [2?3l The non variational version is presented 
in Section 12.41 In Section [3] we show the convergence of the variational approach and give 
an estimate of the rate of convergence. Our arguments immediately follow from standard 
arguments of the literature of nonlinear approximation theory, and especially from those 
of |4]. The particular approach under consideration is indeed closely related to the so-called 
greedy algorithms introduced in approximation theory. We refer to [21 E] for some rele- 
vant contributions, among many. The purpose of Section [4] is to return to the original non 
variational formulation of the approach. For illustration, we first consider the case when 
the Laplace operator —A in ([TJ is replaced by the identity operator. The approach then 
reduces to the determination of the Singular Value Decomposition (also called rank-one 
decomposition) of the right-hand side /. This simple situation allows one to understand 
various difficulties inherent to the non variational formulation of the approach. We then 
discuss the actual case of the Laplace operator, and present some intriguing numerical 
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experiments, in particular when a non-symmetric term (namely there an advection term) 
is added. 

As will be clear from the sequel, our current mathematical understanding of the numer- 
ical approach is rather incomplete. Our results do not cover real practice. Some ingredients 
from the literature of nonlinear approximation theory nevertheless already allow for un- 
derstanding some basics of the approach. It is the hope of the authors that, laying some 
groundwork, the present contribution will sparkle some interest among the experts, and 
allows in a not too far future for a complete understanding of the mathematical nature 
of the approach. Should the need arise, it will also indicate possible improvements of the 
approach so that it is rigorously founded mathematically and, eventually, performs even 
better that the currently existing reports seemingly show. 

Acknowledgments: The authors wish to thank A. Ammar and F. Chinesta for intro- 
ducing them to their series of works initiated in [T|, A. Cohen for stimulating discussions, 
and A. Lozinski for pointing out reference [4j. This work was completed while the first au- 
thor (CLB) was a long-term visitor at the Institute for Mathematics and its Applications, 
Minneapolis. The hospitality of this institution is gratefully acknowledged. 



2 Presentation of the algorithms 

Consider a function / € L'^{Q) where $7 = 17^. x Qy with fi^; C M and fi^ C M two bounded 
domains. To fix ideas, one may take = i^y = (0,1). Consider on Q the following 
homogeneous Dirichlet problem: 

Find g £ H).(n) such that j " { ™ ^' (3) 

It is well known that solving ([3]) is equivalent to solving the variational problem: 

Find g € HQ(n) such that g = arg min ( - / iVuP - [ fu] . (4) 

In the following, for any function u S i/o(r2), we denote 

E{u) = \\ |Vn|2 - / /u. (5) 
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Notice that 



vt Jn 



(6) 



where g is defined by ([3]), so that minimizing £ is equivalent to minimizing \S/{u — g)Y 
with respect to u. We endow the functional space Hq{VL) with the scalar product: 



(u, v) = Vn • Vv, 
Jn 



and the associated norm 

\\uf = {u,u) = I \Vu\''. 



a 
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2.1 Two algorithms 

We now introduce two algorithms to solve ([3]). The first algorithm is the Pure Greedy 
Algorithm: set fo = f, and at iteration n > 1, 

1. Find rn G Hq{Qx) and Sn G HQ^Qy) such that 



(rn, Sn)=arg min - / |V(r(gis)| - / r s . (7) 

2 . Set /„ = fn-\ + A(r„ ® s„) . 

3. If ||/n||H-i(n) ^ ^) proceed to iteration n+1. Otherwise, stop. 

Throughout this article, we denote by r(8)s the tensor product: r(^s{x,y) = r{x)s{y). 
Notice that 



fn = f + ^ 1^5^ rk Skj . 

The fonction fn belongs to H~^{Q) and the tensor product r(8)s is in Hq{Q) if r G Hq{^x) 
and s G i/o(r2y) (see Lemma [U below), so that the integral r s in (l7| is well 

defined. 

A variant of this algorithm is the Orthogonal Greedy Algorithm: set /q = / , and at 
iteration n > 1, 

1. Find r° G HQ{il.x) and s° G H^i^ly) such that 

(r°,0=arg min {\W{r®s)\^-f°_^r®s 

2. Solve the following Galerkin problem on the basis (r°®s°, . . . , r°(8)s°) : find 
(ai, . . . , a„) G M" such that 



fc=i 



(9) 



3. Set /° = / + A(ELi«A.rfc°®4). 

4. If ||/^||_H'-i(r2) > proceed to iteration n + 1. Otherwise, stop. 
Let us also introduce gn satisfying the Dirichlet problem: 

-Agn = fn in 

gn = on c?r2. 



(10) 



Notice that 

gn= Qn-l-rn® Sn- (11) 

SO that gn = 9 — Z]fc=i '^fc ® Sfc. Likewise, we introduce gn = 9 ~ Y12=i''^k *^ '^fc' which 
satisfies —A.g° = f° in Q and g° = on Sfi. Proving the convergence of the algorithms 
amounts to proving that gn and g'j^ converge to 0. 

The terminology Pure Greedy Algorithm and Orthogonal Greedy Algorithm is borrowed 
from approximation theory (see [21 El HI E]). Such algorithms have been introduced in a 
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more general framework, namely for an arbitrary Hilbert space and an arbitrary set of 
functions (not only tensor products). Recall for consistency that, in short, the purpose of 
such nonlinear approximations techniques is to find the best possible approximation of a 
given function as a sum of elements of a prescribed dictionary. The latter does not need to 
have a vectorial structure. In the present case, the dictionary is the set of simple products 
r{x)s{y) for r varying in Hq{^x) and s varying in HQ{Qy) (All this will be formalized 
with the introduction of the space in Section [3] below). The metric chosen to define the 
approximation is the natural metric induced by the differential operator, here the norm. 
The algorithm proposed by Ammar et al. [l] is actually related to the Orthogonal Greedy 
Algorithm: it consists in replacing the optimization procedure ([8|) by the associated Euler- 
Lagrange equations. We shall give details on this in Section [231 below. For the moment, 
we concentrate ourselves on the variational algorithms above. 

2.2 The iterations are well defined 

We will need the following three lemmas. 

Lemma 1 For any measurable functions r : Vl^ ^ and s : ilj^ ^ M such that r (8) s / 

r O s E HI{VL) r € FqI^xO and s G Hl{Vty). 

Lemma 2 Let T G 2?'(J7) he a distribution such that, for any functions ((/), "0) G C'^i^x) x 

(r,0o v)(D'{n),i?{Q)) = 

then T = in V^Q). Moreover, for any two sequences of distributions Rn £ V^H.x) 
and Sn G 'D'{Q,y) such that lim^^oo -Rn = R in D'(r2^) and lim„^oo •S'n = S in D^VLy), 
lim„^oo Rn ® Sn = R ® S in V'{n) . 

Lemma 3 Let us consider a function f G L^(r2). /// ^ 0, then 3(r, s) G H^{Q.x) ^H^^Qy) 
such that 

£{r s) < 0, 

where £ is defined by ([5]). 

Lemma [2] is well-known in distribution theory. We now provide for consistency a short 
proof of Lemmas [T] and [3j respectively. 

Proof of Lemma [T] Notice that 

j |v(r®5)|2= [ |rf / j |r|2 j |sf 

where ' denotes henceforth the differentiation with respect to a one-dimensional argument. 
Thus, it is clear that if r G Hq{Q.x) and s G HQ{Q.y), then r ® s G Hq{^). Now, when 
r (g) s G Hq{Q), we have k'P/n kl^ < ^ ^^i^ jn I'^'l^ ^ This implies 

r G Hq{U,x) and s G i?o(J7j;), since r / and s / 0. 

Proof of Lemma [3] Fix / G L2(Q) and assume that for all (r, s) G Hl{Q.x) x ifg(r2j^), 
£{r O s) > 0. Then, for a fixed (r, s) G Hl{Vtx) x Hl{Vt.y), we have, for all e G M, 

^— j |V(r > e j fr(g)s. 

By letting e — > 0, this shows that / G {r ® s, (r, s) G L'^{Qx) x L'^{Qy)}-^ which implies 
/ = (by Lemma [2]) and concludes the proof. 

The above lemmas allow us to prove. 
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Proposition 1 For each n, there exists a solution to problems ^ and ([8]). 

Proof. Without loss of generality, we may only argue on problem jT]) and assume that 
n = 1 and fo = f ^ 0. First, using ((H), it is clear that 

I I |V(r®s)|2- / fr(^s = \ [ |V(r®s-5)|2-i / jV^P 
^ Jn Jn Jn ^ Jo. 

Thus, we can introduce m = 'y^i(r^s)^ul{si^)>,Hl{siy) [\ In ® - In f ^ ^ ^) and a 
minimizing sequence {r^,s^) such that lim^^oo '^('^'^ ® s'^) = m. Notice that we may 
suppose, again without loss of generality (up to a multiplication of s'^ by a constant), that 

Jn 

Since £{u) > j |Vup — IV^P, the sequence {r^ Cg) s^) is bounded in Hq{Q): there 
exists some C > such that, for all A; > 1, 

/ / / / l(s'')?<c. (12) 

From this we deduce the existence of G i?Q(0), r G L^(ria;) and s G H^{Qy) such that 
(up to the extraction of a subsequence): 

• r*^ (g) s'^ converges to w weakly in Hq{Q), and strongly in L^(0), 

• r'^ converges to r weakly in ^^(rij;), 

• s'' converges to s weakly in ffo(J7y), and strongly in L'^{VLy). 

Since r^ (g) converges to w weakly in Hq{Q) and £ is convex and continuous, we have 
£{w) < liminfjt_^oo ('^'^ ® s^)- This yields £{w) < m. Moreover, by Lemma [H we know 
m < 0. Therefore, 

£{w) < 0. (13) 

The convergences ^ r and ^ s in the distributional sense imply the convergence 
r*' (g) s'^ — > r (g) s in the distributional sense (see Lemma [2]), and therefore w = r ® s. Thus, 
if w / 0, Lemma [T] concludes the proof, showing that indeed r G Hq{Q.x)- Now, we cannot 
have w = 0, since this would imply £{w) = 0, which would contradict (fT3]l . This concludes 
the proof. 

The optimization step ([9]) admits also a solution by standard arguments and we there- 
fore have proven: 

Lemma 4 At each iteration of the Pure Greedy Algorithm, problem ([^ admits (at least) 
a minimizer {rn,Sn)- Likewise, at each iteration of the Orthogonal Greedy Algorithm, 
problem {^j admits (at least) a minimizer (r°,s°). 

It is important to note that, in either case, uniqueness of the iterate is unclear. Through- 
out the text, we will thus be refering to the functions (r„,s„) (resp. (r°,s°)) although we 
do not know whether they are unique. However, our arguments and results are valid for 
any such functions. 
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2.3 Euler-Lagrange equations 

Our purpose is now to derive the Euler-Lagrange equations of the problems considered, 
along with other important properties of the sequences (r„, s„) and (r°, We only state 
the results for (r„,s„). Similar properties hold for (^'^jS^), replacing /„ and gn by f° and 
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The first order optimality conditions write: 

Proposit ion 2 The functions (r^, Sn ) G //d(r2s) X HQ{^}y) satisfying ^ are such that: 
for any functions (r, s) G Hq{^Ix) x HQ{Qy) 



/ V(r„ (g) s„) • V(r„ (g) s + r (g) s„) = / /„_i(r„ (g s + r (g 
This can be written equivalently as 



(14) 



|2 „// 



r" + 



1 2 



' 1 2 



fn—1 Sr, 



fn—1 fn 



(15) 



or, m terms of gn, as: 



{gn, (r (g s„ + r„ s)) = 0. 



(16) 



Proof. Equation (fT4l) is obtained differentiating jT]). Namely, for any (r, s) e -frg(r2i.) x 
i!fo(S7y) and any e G M, we have 

l I |V((r„ + er)0(s„ + es))p- / (r„ + er) (s„ + es) 
^ Jn Jn 

>\ I |V(r„®s„)|2- / /„_ir„®s„. (17) 



It holds: 



^ / |V((r„, + er)«)(sn + es))p- / (r„ + er) (s„ + es) 



^ / |V(r„ s„) +eV(r g) s„ + r„ ® s) + e^V(r s)'^ 

1 

2 



/ /n-l (^n + f^r) g) (Sn + es) 



|V(r„(gs„)p- / /„-ir„(gs„ 
Jn 

+ e( / V(r,i g) s„) • V(r (g s„ + gi s) - / /n-i(rn <g s + g> Sn) 



|V(r®s„ + r„® s)|^+ / V(r„ s„) • V(r s) - / /„_ir ® s + O(e^) 



= J / |V(r„0s„)p- / /„_ir„g)s„ + e/i + e2/2 + 0(e3 
^ Jn Jn 

Using ([El, we get, for any e G M, 

eh + e^h + 0{e^) > 0, 
which implies that Ii is zero, that is, (fT4l) . 



(18) 
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Equation (fT5]l is the strong formulation of (fHl) . On the other hand, ([T6]) is an immediate 
consequence of the following simple computations: 

(5n, {r^Sn + Tn^ s)) = {Qu-I -Tn® Sn, (r 5,^ + r„ ® s)) 

V{gn-i -rn<^ Sn) ■ V(r s„ + r„ (gi s) 

= - / Agn-i{r (E) Sn + Tn (E) s) - / V(r„ (g) s„) • V(r s„ + r„ (g) s) 
= 0, 

since —Agn-i = fn-i in ^ and 5n-i = on 50. ^ 
Note that, taking r = Vn and s = in the Euler-Lagrange equations (fTBl) yields 

(''n O Sn,5n-l) = Ikn gi Snip, (19) 

since 5„ = 5n-i — '"n <8 s„. This will be useful below. 

Let us now state two other properties of (r„,s„). The second order optimality condi- 
tions write: 

Lemma 5 The functions [vn, s„) G Hq{Qx) x -f^o(^y) satisfying JT)) are s«c/i that: for any 
functions (r, s) G i/Q(r2^) x HQ{Qy) 

I [ |V(r®s„ + r„®s)p+ /" V(r„ s„) • V(r g) s) - / /„-ir ® s > 0, (20) 
^ JO Jo Jn 

which is equivalent to: for any functions (r, s) S ifQ(J72.) x {{^{^y) 

V(r„ ® s„ - 5„) • V(r s)^ < / |V(r®s„)|2 / |V(rn®s)p. (21) 



Proof. Returning to Equation (fTSll . we see that Ii = and /2 > 0, which is exactly (f20]l . 
For any A G M, taking (Ar, s) as a test function in ([20]) shows 



|AV(r®s„)+V(r„®s)p+ / AV(r„ ® s„) • V(r ® s) - / /„-iArg)s>0 



1 
2 

This equivalently reads 

4" / \V{r® Sn)f + x( [ (V(r s„) • V(r„ g) s) + V(r„ s„) • V(r g) s)) - / /„_ir s 
^ JO \JO Jo 



+ i / |V(r„®s)|2 > 0, 
^ JO 



hence 

[ (V(r Sn) • V(r„ (g s) + V(r„ s„) • V(r s)) - /" /n_ir (g s 
JO JO 

< / \V{r0Sn)\^ I |V(r„®s)|^ 



This yields 
We will also need the following optimality property of (r„, s„): 



Lemma 6 The functions (r„,s„) satisfying ^ are such that: V(r, s) S Hq{Qx) x ^o(^j/) 

II II _ {rn Sn,gn-l) (r (g) g,gn-l) 

Proof. We may assume without loss of generality that n = 1. The first equality is ([19]). 
To prove the inequality, let us introduce the supremum: 

M = sup (n 1^ v,g). 

(u,v)&Hl^{nx)xH^{ny),\\um\\=i 

Using (fT9]l . we have 

lhC5.i|| = ^^l^^<M, (22) 

by definition of M. On the other hand, consider (^i*^, f '^)fc>o a maximizing sequence as- 
sociated to the supremum M: \\u'' v'^W = 1 and limfc^oo(w^ ® f'^,/?) = M. We have, 
using ([7]), for all /c > 0, 

\\g - ri (g) Slip < ll^f - {g,u'' v^) v!^ (g> v^\\'^ 
= \\gf -{g,u^ ®v^f, 

and, letting k ^ oo, 

\\g-n(g)sif < \\gf -M"^. (23) 
Combining ^ and we get 

lb - ri Slip < \\gf - 

< ll^lP — lln (g) Slip 

= ll^lp - 2{g,ri (g si) + lln (g Slip 

= h - n ® Slip 

so that all the inequalities are actually equalities. By using the fact that, by ([22]) . M > 0, 
we thus have 

^J II ^ II si,g) 
M = ri (g si = -r ^. 

lln (g Sill 

This concludes the proof. ^ 



2.4 Some preliminary remarks on the non variational approach imple- 
mented 

Before we get to the proof of the convergence of the approach in the next section, let us 
conclude Section [2] by some comments that relates the theoretical framework developed 
here to the practice. 

It is important to already note, although we will return to this in Section |4] below, that 
the Euler-Lagrange equation is indeed the form of the algorithm manipulated in practice 
by the authors of [l]. The above variational setting is somewhat difficult to implement 
in practice. It requires to solve for the minimizers of ([7} (and ^ respectively), which 
can be extremely demanding computationally. In their implementation of the approach 
(developed independently from the above nonlinear approximation theoretic framework), 
Ammar et al. therefore propose to search for the iterate (r„, Sn) (and respectively (r°, s°)) 
not as a minimizer to optimization problems ^ and ([8]), but as a solution to the associated 
Euler-Lagrange equations (first order optimality conditions). The Pure Greedy algorithm 
is thus replaced by: set /o = /, and at iteration n > 1, 
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1. Find r„ £ Hq{Qx) and s„ G {{^{Qy) such that, for all functions (r, s) G 
ff^(O^) X , ([ni) (or its equivalent form ([15|)) holds. 

2. Set /„ = + A(r„ O s„) . 

3. If ||/n||//-i(Q) > proceed to iteration Otherwise, stop. 
The Orthogonal Greedy Algorithm is modified likewise. 

As already explained in the introduction, and in sharp contrast to the situation en- 
countered for linear problems, being a solution to the Euler-Lagrange equation does not 
guarantee being a minimizer in this nonlinear framework. We will point out difficulties 
originating from this in Section [H 

In addition to the above theoretical difficulty, and in fact somehow entangled to it, we 
have to mention that of course, the Euler-Lagrange equations (fTSj) . as a nonlinear system, 
need to be solved iteratively. In [1], a simple fixed point procedure is employed: choose 
(r°,s°) G Hq{^}^) X H^i^y) and, at iteration k>0, compute (r^,s^) G Hq{Q,x) x H^{^ly) 
solution to: 

(24) 

until convergence is reached. We will also discuss below the convergence properties of this 
procedure on simple examples. 

Remark 1 In practice (bearing in mind that the approach has been designed to solve high- 
dimensional problems), in order for the right-hand side terms in (f24ll to be computable, 
the function f needs to be expressed as a sum of tensor products. Otherwise, computing 
high dimensional integrals would be necessary, and this is a task of the same computational 
complexity as the original Poisson problem. The function f thus needs to enjoy some 
appropriate separation property with respect to the different coordinates. 

If f is not given in such a form, it may be possible to first apply the Singular Value 
Decomposition algorithm to get a good estimate of f as a sum of tensor products (see 
Section \4.1\ ). 

Remark 2 In dimension N > 2 (on a parallelepipedic domain $7 = x ... x O.^^.^), 
the Euler-Lagrange equations (fH|l become: find functions (r^,...,r^) G Hq{Qxi) x ... x 
-f^o(^a;iv) SMc/i that: for any functions {r^,. . . ,r^) G Hq{Qxi) x . . . x H^{^xi.i), 

N 

/ V(r^ ... 8) r^) • V V(r^ (g) . . . O r^~^ (gi r'' (g) r^+^ ... r^) 
■^^ k=i 

N 

= / V(r^®...®r^-^®r^®r^+^® ...®r^). (25) 

■^^ k=i 

This is a nonlinear system of N equations, which only involves one- dimensional integrals 
by Fubini theorem, provided that the data f is expressed as a sum of tensor products (see 
Remark\^ . 

Remark 3 We presented the algorithms without space discretization, which is required for 
the practical implementation. In practice, finite element spaces (resp. Vy) are used to 



10 



discretized Hq{Qx) (resp. HQ{Qy)), where h> denotes a space discretization parameter. 

- jh 
y 



The space discretized version of (I14p thus writes: find (r^,s^) G x Vy such that, for 



any functions {r^, s^) £ x Vy 

V(r^ 4) • V(r:^ ^s^ + r'^0st)= [ ftiir'^, C3 / + r'^ 4)- (26) 
n Jn 

3 Convergence 

To start with, we prove that the approach converges. Then we will turn to the rate of 
convergence. 

3.1 Convergence of the method 

Theorem 1 [Pure Greedy Algorithm] 

Consider the Pure Greedy Algorithm, and assume first that satisfies the Euler- 

Lagrange equations (fT4l) . Denote by 

En = l I |V(r„ s„)|2 - / r„ Sn (27) 
^ Jn 

the energy at iteration n. We have 

5^ /" |V(r„®s„)|2 = -2j^K < oo. (28) 

n •'^ n 

Assume in addition that {rn,Sn) is a minimizer of ([7]). Then, 

lim 5„ = in H^{n). (29) 

n— >oo 

Immediate consequences of (f28l) and (f29]l are 



lim = lim ||r„ (8) Sn || = 0, 

n— >oo n— »oo 



and 

lim /n = in H'^Q). 

n— >oo 

Proof. Let us first suppose that (r-„,Sn) only satisfies the Euler-Lagrange equations (fT4l) . 
We notice that, using (fTBl) 

||5n-l|P = Ibn + r„ O 

= Ibnf + (30) 
Thus, ||(7n|P is a nonnegative non increasing sequence. Hence it converges. This implies 

that \^i^n ® Sn)P < OO. 

Next, notice that 

^n = 7; |V(r„ (gi s„)p - / /„_ir„(g)s„ 

_ 1 
~ 2 



V(r„(gis„)| - / V5„_i • V(r„ (g) s„) 
n Jn 

|V(r„® s„)p, 
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since by (fTOl) . /^^ V^n-i • V(r„ ® s^) = /j^ |V(r„ Sn)P- This proves the first part of 
the theorem. At this stage, we have only used that (rn,s„) satisfies the Euler-Lagrange 
equations (fTSll . 

To conclude that lim„^oo fn = 0, we now need to assume that (rni^n 

) indeed satisfies 

the minimization problem jT]). We know that HffnlP is a bounded sequence, and therefore, 
we may assume (up to the extraction of a subsequence) that Qn converges weakly in Hq{Q) 
to some Qoo £ Hq{^)- For any n > 1 and for any functions {r,s) G i/g(r2a;) x HQiyty)^ 



^ Jn Jn 



2 

By passing to the limit this inequality, we have 

I I |V(r®s)|2- / Vc/oo- V(r®s) > 0. 
^ Jn Jn 

This implies that for any functions (r, s) E Hq{^Ix) x HQ(^ly), 

[ V^oo • V(r s) = 0. 
Jn 

Thus, by Lemma O —Ag^o = in the distributional sense, which, since goo S Hq{Q), 
implies ^oo = 0. This shows that there is only one possible limit for the subsequence gn 
and thus that the whole sequence itself weakly converges to 0. 

The convergence of gn to is actually strong in Hq{Q,). The argument we use here is 
taken from [5]. Using Lemma [U we have: for any n > m > 

\ V k=m+l / I 

n 

= Ibnll^ + Ibmll^ - 2||5n|P - 2 ^ {gn,rk®Sk) 

k=m+l 

n 

< -IbnlP + ll^mll^ + 2 ^ Ikfc <8) Sfclllkn+1 l» S„+l||- 

fc=m+l 

Define = 1, 0(2) = arg min„>^(i){||r„ (g) < \\r^(^i) (g) and, by induction, 

4>{k + 1) = arg mill {||r„ (g < \\r^,k) s<^(fc)ll}- 

n>(j>{k) 

Notice that limfc^oo </'(^) = oo since limfc^oo \\i"k <^ = 0. For example, if (||rfc Sfc||)fc>i 
is a decreasing sequence, then cj}{k) = k. Now, we have: for any I > k > 

4>{i)-i 

1150(0-1 -9<p(k)~i\? < -\\9,t>{i)-if + \\94>{k)-i\f + 2 X] ® SilllkfliW •5</.(/)ll 
< -||5</>(/)-if + Ib^Cfc)-!!!^ + 2 ^ Iki^Sif. 

i=^{k) 

Since X^fc>i H'^fc < oo and (||fl'n||)n>i is converging, the previous inequality shows 
that the subsequence (50(/)-i)i>o is a Cauchy sequence, and therefore strongly converges 
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to (recall it is already known that gn weakly converges to 0). Since \\gn\\ is itself a 
converging sequence, this shows that 

lim \\gn\\ = 0. 

n— >oo 



A similar result holds for the Orthogonal Greedy Algorithm. 

Theorem 2 [Orthogonal Greedy Algorithm] 

Consider the Orthogonal Greedy Algorithm, and assume first that only sat- 

isfies the Euler-Lagrange equations (fT4|l associated with ^ (thus with {rn,Sn,fn-i) = 
(r°,s°,/°_i) m (HID;. Denote by 

K = \j |V(r°®<)|2- f f^.^r^^s"^ (31) 
^ Jn Jn 

the energy at iteration n. We have 

E/ |V(rX)|2 = -2^i5;°<oo. (32) 

Assume in addition that (r°,s°) is indeed a minimizer to the optimization problem 
Then, 

lim 5° = «n hI{VL). (33) 

n— >oo 

Immediate consequences of ((321) and (f33]l are 

lim = lim |K s^|| = 0, 

n— >oo n— >oo 

and 

lim /° = in if-^(Q). 

n— >oo 

Proof. Let us first assume that (r^,s^) only satisfies the Euler-Lagrange equations (fT4ll 
(with (r„,s„,/„_i) = (r°,<,/°_i) in dni). Notice that by (© and (HH): 



fc=i 

„o ^ „o ||2 



In" l|2 _ \\r° f5?) «° l|2 
\9n-l\\ \Vn ^ ^n\\ • 



Thus, 1 1 5^ IP is a nonnegative non increasing sequence. Hence it converges. This implies 
that X^fc>i \\r'^ s^lP < oo, and proves the first part of the theorem, using the same 
arguments as in the proof of Theorem [H 

Let us now assume in addition that (r°,s°) is a minimizer to dH]). For fixed r and s, 
we derive from ([8|): 

_1 f |v(rX)|2 = i f |V(r°®<)|2- /" /^ir><<i / \V{r0s)\'-[ f^^.r^s. 

Letting n go to infinity, and using the same arguments as in the proof of Theorem [H this 
implies that 5° weakly converges to in H^{^). The proof of the strong convergence of 
(7° to zero is then easy since, using the Euler Lagrange equations associated to dH]): 

and the right-hand side converges to 0. ^ 



13 



3.2 Rate of convergence of the method 

We now present an estimate of the rate of convergence for both the Pure and the Orthogonal 
Greedy Algorithms. These results are borrowed from [4j. We begin by only citing the 
result for Pure Greedy Algorithm. On the other hand, with a view to showing the typical 
mathematical ingredients at play, we outline the proof of convergence of the Orthogonal 
Greedy Algorithm, contained in the original article [4]. 

We first need to introduce a functional space adapted to the convergence analysis 
(see [21 H). 

Definition 1 We define the C} space as 

= {9 = ^CkUk 8) Vk, where Uk € H^i^x), Vk G H^iny), \\uk "Ufcll = 1 and ^ \ck\ < 



k>0 



k>0 



and we define the C^-norm as 



\\g\\c^ = inf < ^ \ck\,g = ^CkUk O Vk, where \\uk (^Vk\\ = l> , 

[fc>0 fc>0 J 

for g £ . 

The following properties may readily be established: 

• The space is a Banach space. 

• The space is continuously embedded in Hq{^1). 

Notice that, in the definition of , the function g = Ylik>o ^k^k ®Vk is indeed well defined 
in Hq{Q) as a normally convergent series. This also shows that C i?o(0), and this 
imbedding is continuous by the triangle inequality || ^k>Q ^kUk Vk\\ < Y^k>o \ ^k\- 

We do not know if there exists a simple characterization of functions in C^. Let us 
however give simple examples of such functions. 

Lemma 7 For any m>2, H"^{Q) n Hl{n) C . 

Proof. Without loss of generality, consider the case Q.x = = (0, 1). Using the fact that 
{(pk ^ (t)i,k,l > 1}, where 4>k{x) = \/2 sin(/c7ra;), is an orthonormal basis of L'^{Q), we can 
write any function g £ L'^{9) as the series g = Yyk,i>i 9k,i(l>k'^ (f>i, where gk,i = j'^g (pk'^ <Pi- 
It is well known that 



g G Hl{^) ^ 

and, more generally, for any m > 1, 

g£H'^{Vt)r^Hl{Vt) 



k,l>l 



\gk,i\'^ik'^ + 1^) < 00 



1 2^.2 I j2\m 



k,l>l 



On the other hand. 



E 

kd>l 



^ 9k,l\\<Pk ® <Pl\ 



4>k 18 (pi 



kl>l 



kJ>l 



\9k,l 



7r\/fc2 + /2 
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since ||(/>a; </)i|| = tt^/W^TT^ ■ Thus, by the Holder inequahty, we have, for any m > 2, if 

k,l>l 

1/2 / \ 1/2 

1— m 



\k,l>l I \k,l>l 

< OO, 

since J2k />i(^^ + ^^)^~'" < oo as soon as m > 2. ^ 

Remark 4 More generally, in dimension N > 2, the same proof shows that: for any 
m > 1 + N/2, H'^ln) n H^{n) C C^. 

Let us now give the rate of convergence of the Pure Greedy Algorithm. For the details 
of the proof, we again refer to [4]. The proof is based on the fundamental lemma: 

Lemma 8 (j4l, Lemma 3.5]) Let us assume that g & . Then, for any re > 0, g„ G £^ 

and we have: 

II ^ II {gn,rn+l^ Sn+l) . HSnlP 

||r„+i (g) \\gn\\c^ 
The following technical result (easily obtained by induction) is also needed. 

Lemma 9 ([4l Lemma 3.4]) Let (a„)n>i o, sequence of non-negative real numbers and 
A a positive real number such that ai < A and an+i < On (l — ■ ^^^^^ ^ 1; 

A 

an < —■ 

n 

Using Lemma E] and Lemma [9], it is possible to show: 
Theorem 3 ([4, Theorem 3.6]) For g £ , we have 

\\9n\\ < \\gf/%\\fn-'/'. (34) 

A better rate of convergence can be proven for the Orthogonal Greedy Algorithm. For 
the Orthogonal Greedy Algorithm, the following Lemma plays the role of Lemma [8l 

Lemma 10 Assume that g £ C^. Then, for any n > 0, (7° G £^ and we have: 

|UO ^ o II «,^^+l^<+l) IKf 



Proof. Since gn = 9 — Y^^=i^kfk ® Sfc, it is clear that g„ G . The equality \\r'^j^i 
ciated to the optimization problem on (r°_,_^, s°_,_^) (see (fT9]) 



■^n+ill = ||"o "^i,„o "'^ii is obtained as a consequence of the Euler-Lagrange equations asso- 

II n + l^ n + l II 
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Since g £ C^, for any e > 0, we can write g = J2k>o '^kUk i^k with \\uk ®Vk\\ = 1, and 
Ylik>o \^k\ < Ibll^i + By (IH), we have {g — gniOn) — 0) therefore, using Lemma[6l 



i9rv^CkUk®Vk 
\ k>0 I 

"^Ckig^jUk 0vk) 



k>0 
fc>0 



n+1/ 



' ra+1 



'n+1 1 



from which we conclude letting e vanish. 







Theorem 4 ([H Theorem 3.7]) For g £ we /ia?;e 

m<\\9\\c^n-'/\ 
Proof. We have, using (fTHI) and Lemma [TOl 



n+1 



akrl ® Sfc 



fc=i 

<\\9n-r° 



n+1 ^ ^n+ll 



li/n II 



''n+1 ® ^n+1 



<\K\?\i 



•,o||2 



MP 

Ibll^i 



(35) 



The conclusion is reached applying Lemma [9] with a„ = and A 







Remark 5 The rate of convergence of the Pure Greedy Algorithm in (l34l) may he improved 
to n^^^/^^ For both algorithms, it is known that there exists dictionaries and right-hand 
sides f (even simple ones, like a sum of only two elements of the dictionary) such that the 
rate of convergence is attained (see In that sense, the Orthogonal Greedy 

Algorithm realizes the optimal rate of convergence. Notice that this rate of convergence 
does not depend on the dimension of the problem. However, the assumption g £ seems 
to be more and more demanding, in terms of regularity, as the dimension increases (see 
Remark\^. 



4 Discussion and open problems 

We begin this section by considering the case when the Laplace operator is replaced by 
the identity operator. We examine on this simplified case the discrepancy between the 
variational approach consisting in minimizing the energy and the non variational approach 
solving the Euler-Lagrange equation. 
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4.1 The Singular Value Decomposition case 

The algorithms we have presented above are closely related to the Singular Value Decom- 
position (SVD, also called rank one decomposition). More precisely, omitting the gradient 
in the optimization problem ^ yields: find r„ G L'^{Qx) and s„, G L'^{Qy) such that 

with the recursion relation 

9n — 9n—l ~ ffi® Sn, 

and go = g. 

In view of the exact same arguments as in the previous sections, the series X]„>i rn®Sn 
can be shown to converge to g in Lp'{^l). This problem has a well-known companion 
discrete problem, namely the SVD decomposition of a matrix (see for example p]). This 
corresponds to the case Q,x = {1, . . . Oj, = {1, . . . the integral is replaced by 
the discrete sum X](ij)Gi pxi G is a matrix in W'^'^ and {Rn,Sn) are two (column) 
vectors in MP x R'^. In this case the tensor product Rn Sn is simply the matrix Rn{Sn)^ ■ 
The matrices G„ G W^'^ are then defined by recursion: Gq = G and Gn = Gn-i—Rn{Sn)'^ ■ 

4.1.1 Orthogonality property 

An important property of the sequence (r„,Sn) generated by the algorithm in the SVD 
case is the orthogonality relation: if n / m 



In order to check this, let us first write the Euler-Lagrange equations in the SVD case 
(compare with ([HI)): for any functions (r, s) G L?'{Qx) x L'^i^y), 

{rn®s + r®Sn)= I gn-i{rn® s + r ® Sn)- (38) 



This also reads (compare with (1151) ): 

It is immediate to see that (f38]l for n = 1 and n = 2 implies, 

/ (r2 (g S2)(r2 si) = / (r2 S2){ri S2) = 0. 
Likewise, it can be shown, for any n > 2 and any / G {2, . . . n} 

„ n „ n 

/ ^{rk ® Sk) (r„ s/_i) = / ^(r-fc (g) Sk) {n^i ® s„) = 0. 



(39) 



(40) 



The orthogonality property (f37l) is then easy to check using the Fubini Theorem and 
arguing by induction. 
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Remark 6 A simple consequence of the orthogonality of the functions obtained by the algo- 
rithm is that, in the discrete version (SVD of a matrix G G MP^'^) the algorithm converges 
in a finite number of iterations (namely ma.x{p,q)). As usual in this situation, practice 
may significantly deviate from the above theory if round-off errors due to floating-point 
computations are taken into account. This is especially true if the matrix is ill conditioned. 

4.1.2 Consequences of the orthogonality property 

The orthogonality property has several consequences: Assume the SVD to be nondegener- 
ate in the sense 

9 = '^KUn^Vn, (41) 
n>l 

with 

Vn, m, and iXn)n>i positive, strictly decreasing, (42) 

where Sn,m is the Kronecker symbol. Then 

• (i) The Pure Greedy Algorithm and the Orthogonal Greedy Algorithm are equivalent 
to one another in the SVD case. 

• (ii) The SVD decomposition g = ^,„>i rn Sn is unique. 

• (iii) At iteration n, Ylk=i ^fc ® Sfc is the minimizer of \g — Ylk=i 4'k®'4^k\^ over all 
possible {(t)k,^k)i<k<n e {L'^i^x) x L'^i^.y))"'. 

In addition, simple arguments show that, 

• (iv) The only solutions to the Euler Lagrange equations (f38]) are the null solution 
(0, 0) and the tensor products XnUn ^ Vn (for all n > 1) in the SVD decomposition 
of g. 

• (v) The solutions to the Euler-Lagrange equations which maximize the L^-norm 

1 /2 

{In \^ ® •^1^) exactly the solutions to the variational problem ([36]) . 

• (vi) In dimension N = 2, the solutions to the Euler-Lagrange equation that satisfy the 
second order optimality conditions are exactly the solutions of the original variational 
problem (f36ll . 

Notice that there is no loss of generality in assuming > 0, and (A,i)n>i decreasing 
in ([4T]l (up to a change of the {un-,Vn)). The fundamental assumption in nondegeneracy 
is thus that / Am, if n / m. When the decomposition has some degeneracy {i.e. 
several n correspond to the same An in ([4T]) ) then properties (i)-(iii)-(v)-(vi) still hold true. 
On the other hand, in (ii) the SVD is only unique up to rotations within eigenspaces and 
property (iv) must be modified accordingly. In short, the only other solutions beyond those 
mentioned above consist of tensor products of linear combinations of functions within a 
given eigenspace. We skip such technicalities. The degenerate case indeed does not differ 
much from the non degenerate case above in the sense that a complete understanding of 
the algorithm, both in its variational and in its non variational forms, is at hand. 

Let us briefly outline the proofs of assertions (iv)-(v)-(vi). 
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We first prove assertion (iv). It is sufficient to consider the first iteration of the algo- 
rithm. Using the SVD decomposition of g, the Euler-Lagrange equations write: for any 
functions (r, s) € L'^i^x) x L'^i^y), 

Jn „>i Jn 

Using the orthogonahty property, and successively (r, s) = (0,u„) and (r, s) = (u„,0) as 
test functions, we get 

Jn, 



n^ 



nur. 



which yields: Vn > 1 



riu, 




(A. 



0. 



= 0, or there exists a unique no 
Jq riUn = (because the product 



Since for n ^ m, Xn ^ Am, this shows that either ri si 
such that Xno = \J Jq \ri (8> sip and Vn / no, siVn 

siVn riUn cancels and thus each of the term cancels because of the Euler Lagrange 
equations). Since by the Euler-Lagrange equations, ri (resp. si) can be decomposed on the 
set of orthogonal functions {un, n > 1) (resp. (f„, n > 1)), we get ri (?) si = Xn^Uno ® ^^noi 
which concludes the proof of assertion (iv). Assertion (v) is readily obtained using (iv) 
and the orthogonality property. Notice that assertion (ii) is a consequence of assertions 
(v). To prove assertion (vi), we recall that the second order optimality condition writes 



IV 



(see LemmalS, adapted to the SVD case): V(r, s) G L'^{^x) x i^(^^ 



yh 



gn)r ® s 



< 



n 



r„ (g) s 



n 



(43) 



1. Let us consider a solution of the Euler- 
and let us take as test functions in (SHI) 



It is again enough to consider the case n 
Lagrange equation: ri ® si = XngUno ^ Vno, 
(r, s) = {un,Vn), for all n > 1. We obtain that for all n > 1, (A„)^ < (A„q)^ which 
concludes the proof of assertion (vi). Notice that in dimension > 3, assertion (vi) 
seemingly does not hold: the solutions to the Euler-Lagrange equation that satisfy the 
second order optimality conditions may not necessarily be global minimizers. 



4.1.3 Link between the Euler-Lagrange equations and the variational problem 

Properties (iv)-(v)-(vi) above tend to indicate that, at least in the SVD case, the considera- 
tion of the solutions to the Euler-Lagrange equations is somehow close to the consideration 
of the minimization problems. 

Indeed, if we assume that at each iteration, non zero solutions of the Euler-Lagrange 
equations are obtained (of course under the assumption gn-i / in (f38]l ). then the non 
variational form of the algorithm, if it converges, will eventually provide the correct de- 
composition. We however would like to mention two practical difficulties. 

First, it is not clear in practice how to compute the norm \\gn\\ to check the convergence, 
since this is in general a high dimensional integral. A more realistic convergence criterion 
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would read: 11 Tfi ^ Sfi II is small compared to 
it is possible to erroneously conclude that t 



En—l ^ 



However, using this criterion, 



the algorithm has converged, while a term with 
an arbitrarily large contribution has been missed. Indeed, consider again, to convey the 
idea, the case (l4T]l - ([42]) . Assume that the tensor product X2U2'^V2 is picked at first iteration 
(instead of the tensor product Xiui0vi which would be selected by the variational version 
of the algorithm). Assume similarly that A3U3 fs is picked at second iteration, and so 
on and so forth. In such a situation, one would then decide the series A„n„ f„ solves 

n>2 

the problem, while obsviously it does not. We will show below (see Section I4.1.4P that 
in the simple fixed-point procedure we have described above to solve the nonlinear Euler- 
Lagrange equations, the fact that AiUi (8) vi is missed, and never obtained as a solution, 
may indeed happen as soon as the initial condition of the iterative procedure has a zero 
component on the eigenspace associated to Ai. 

Second, without an additional assumption reminiscent of the minimizing character of 
the solution, iteratively solving the Euler-Lagrange equations may result in picking the 
tensor products XnUn <^ Vn in an order not appropriate for computational efficiency. Such 
an assumption is present in assertions (v) and (vi). For the illustration, let us indeed 
consider a SVD decomposition 

n>l 

for some functions Un and that become highly oscillatory when n grows. It is clear 
that we may obtain an error in norm that is arbitrarily large at each iteration of the 
algorithm. In particular, it may happen (in particular if smooth functions are chosen as 
initial guesses for the nonlinear iteration loop solving the Euler-Lagrange equation) that 
the highly oscillatory products are only selected in the latest iterations, although they con- 
tribute to the error in a major way. A poor efficiency of the algorithm follows. Inevitably, 
reaching computational efficiency therefore requires to account for some additional assump- 
tions to select the appropriate solutions among the many solutions of the Euler-Lagrange 
equations. 

In the spirit of the above discussion, one can notice that 

• (vii) The null solution (0, 0) to the Euler-Lagrange equation ([38]l is generically not 
isolated within the set of all solutions. 

Indeed, consider a SVD g = A„ Un®Vn, such that Un and f„ are non-zero functions for 

n>l 

all n > 1 (and A„ > 0). Then, any {XnUn,Vn) is a solution of the Euler Lagrange equation 
at the first iteration, and the norm of the {XnUn,Vn) which is selected may be arbitrarily 
small since the series X]„>i A^ u„ (gi Vn converges, and therefore ||A,iM,i ® Vn\\ goes to zero. 
A similar argument applies to all iterations of the algorithm. Therefore, a criterion of 



convergence of the type (g Sn 1 1 is small compared to 



^1=1 Tk Sk 



may again yield 



an erroneous conclusion and lead to a prematurate termination of the iterations. 

Remark 7 Note of course that the relaxation step performed in the orthogonal version of 
the algorithm does not solve any of the above difficulties. 

4.1.4 Resolution of the Euler-Lagrange equations 

A last comment we would like to make on the SVD case again concerns the practical 
implementation of the solution procedure for the Euler-Lagrange equations. Consider the 
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discrete case for clarity. The fixed point procedure then simply writes (for a fixed n): at 
iteration A; > 0, compute two vectors {R^, S^) gWxR'i g^^j^ ^j^^t: 



One can check that this procedure is similar to the power method to compute the largest 



where || • || here denotes the Euclidean norm and where we have omitted the subscripts 
n and n — 1 for clarity. To study the convergence of this algorithm one can assume that 
G is actually a diagonal matrix up to a change of coordinate. Indeed, let us introduce 
the SVD decomposition of G: G = U'EV'^ where U and V are two orthogonal matrices, 
and S is a diagonal matrix with non-negative coefficients. Without loss of generality, we 
may assume that q < p, U £ RP""^, S G M"^'?, V G M^^^ and > ^2,2 > • • • > ^g,q. 
For simplicity, assume that > S2,2 > 0. Then, setting 5^ = V'^S'', the recursion 

~ ~ II II 

reads 5^^"^ = (S-^I!)5'^ ||ss''|P ^^'^ convergence is easy to study. One can check that 
if the initial condition has a non-zero component along the vector associated to the 
largest value then S'^ converges to this vector. The convergence is geometric, with 
a rate related to (^^ least if the initial condition has a non-zero component along 
the vector associated to ^2,2, otherwise S2,2 should be replaced by the appropriate largest 
^k,k, with A; > 1). Of course, if the initial condition is not well chosen (namely, if has a 
zero component along the vector associated to ^1,1), then this algorithm cannot converge 
to the solution of the variational version of the algorithm. 

We would like to mention that this method to compute the SVD of a matrix is actually 
known to poorly perform in practice. More precisely, the approach is very sensitive to 
numerical perturbations, see [9[ Lecture 31]) since the condition number of (G„_i)^G„_i 
is typically large. Alternative methods exist that compute the SVD decomposition, and it 
would be interesting to use these techniques as guidelines to build more efficient procedures 
to solve the nonlinear Euler-Lagrange equations (fTSll . 

4.2 Euler-Lagrange approach for the Poisson problem 

We now return to the solution of the Poisson problem. Our purpose is to see which of the 
above mentioned difficulties survive in this case. We shall also see new difficulties appear. 

We first observe, on a general note, that a property similar to (l40]) holds in the Poisson 
case, namely: 



V > rfc^Sfc • V(r„^s«_i) = / V > r^^Sfc • V(n_i s„) = 0. (45) 



This, however, does not seem to imply any simple orthogonality property as (f37l) . In 
particular, in the Poisson case, it is generally wrong that, for n ^ tti, ^{vn (E) Sn) ■ 
V(rm, O Sm) = 0. 

Next, we remark that none of the properties (i)-(ii)-(iii) holds in the Poisson case. Like- 
wise, we are not able to characterize the list of solutions to the Euler-Lagrange equations 
as we did in (iv)-(v)-(vi). 




(44) 






21 



This is for the generic situation, but in order to better demonstrate the connections 
between the SVD case above and the Poisson case, let us show that, in fact, the Poisson 
case necessarily embeds all the difficulties of the SVD case. For this purpose, we consider 
the original algorithm (for the Poisson problem) performed for a particular right-hand-side 
f = - Ag, namely 

9 = J2k=i (^k(t>k ® ipk where Ok G M, 

(pk (resp. tpk) are eigenfunctions of 
< (46) 
the homogeneous Dirichlet operator —dxx (resp. —dyy) 

^ and satisfy yk,l, J 4)k(t)i = J ipkipi = Sk,h 

where 6k,i is again the Kronecker symbol. Then, it can be shown that, as in the SVD case, 
Tk^Sk = ctk't'k'^ipk are indeed solution to the Euler-Lagrange equations (fT4l) . This suffices 
to show the non uniqueness of the solution. Furthermore, and in sharp contrast to (iv), 
there even exist solutions to the Euler Lagrange equations that are not of the above form. 

Here is an example of the latter claim. Consider the case (pi = tpi, associated with an 
eigenvalue Ai and (j)2 = ip2, associated with an eigenvalue A2 / Ai. We suppose = for 
k > 2. We are looking for r and s solution to the Euler-Lagrange equations 

s|V' + J |s'|V = j fs, 
< 

- j |r|2s" + j Ir'ps = j fr. 

Then, it can be checked that r = ri(pi+r2(p2 and s = sitlji + S21P2 are solution to the Euler- 
Lagrange equations, with the following set of parameters: ri = 1, r2 = 1/2, si = 2, S2 = 1, 
ai = ^''^Ixi^^ = ^^2^~- Likewise, it is immediate to see that (vii) still holds. In 

view of the above remarks, it seems difficult to devise (and, even more difficult, to prove 
the convergence of) efficient iterative procedures to correctly solve the Euler-Lagrange 
equation. 




4.3 Some numerical experiments and the non self-adjoint case 

We now show some numerical tests. Even though the algorithms presented above have 
been designed for solving problems in high dimension, we restrict ourselves to the two- 
dimensional case. For numerical results in higher dimension, we refer to [1]. Moreover, 
we consider the discrete case mentioned in Section I4T1 which writes (compare with ([3])): 
for a given symmetric positive definite matrix D G M'^^'^ (which plays the role of the one- 
dimensional operator —dxx), and a given matrix F S M'^^"' (which plays the role of the 
right-hand side /): 

Find GeR"^""^ such that DG + GD = F. (47) 

Here, the dimension d typically corresponds to the number of points used to discretize the 
one-dimensional functions r„ or s„. To this problem is associated the variational problem 
(compare with (|4|) 

Find G£M!^'"^ such that G = arg min (^R±ER - F ) : f/, (48) 

where, for two matrices A,B^ M'^^'^, A: B = YIiKi j<d^i,i^id- The matrix G is built as 
a sum of rank one matrices RkS'^ with {Rk, Sk) G (M"')^, using the following Pure Greedy 
Algorithm (compare with the algorithm presented in Section [2?T]) : Set Fq = F and at 
iteration n > 1 , 



22 



1. Find Rn and S",,, two vectors in M'^ such that: 

(ii„,5„) = arg min ( ^^^^^^^^^ - Fn-i) : (RS^). (49) 

2 . SetS F„ = - {DRnSj; + i2„5jZ)) . 

3. If Il-Fnil > £) proceed to iteration n+1. Otherwise stop. 

As explained in Section [2^ Step 1 of the above algorithm is replaced in practice by the 
resolution of the associated Euler-Lagrange equations. This consists in finding two vectors 
Rn and 5„ in solution to the nonlinear equations: 

{ll'S'niP DRn + Rn = Fn—lSn, (50) 

||-Rn|P DSn + ll-^nlll) Sn = F!^_^Rn, 

where, for any vectors R G M^, we set = R^DR. This nonlinear problem is solved 

by a simple fixed point procedure (as (p4l) ). We have observed in practice that choosing 
a random vector as an initial condition for the fixed point procedure is more efficient 
than taking a given deterministic vector (like (1, . . . , 1)^). This is of course related to the 
convergence properties of the fixed point procedure we discussed in Section 14.1.41 



4.3.1 Convergence of the method 

In this section, we take D diagonal, with (1,2,..., d) on the diagonal, and a random matrix 
F. The parameter e is 10~^. We observe that the algorithm always converges. This means 
that, in practice, the solutions of the Euler-Lagrange equations ([50]l selected by the fixed 
point procedure are appropriate. 

On Figure [H we plot the energy (^ du„+u„d _ . ^^^^ ^^^^^ jj^ ^ ^n^^ ^^^T 

observe that the energy rapidly decreases and next reaches a plateau. This is a general 
feature that we observe on all the tests we perfomed. 

In Table [U we give the number of iterations necessary for convergence, as a function 
of d. We observe a linear dependency, which unfortunately we are unable to explain 
theoretically. 



d 


10 20 30 


Number of iterations 


22-23 45-46 69-70 



Table 1: Number of iterations typically needed for convergence as a function of d, for 
various random matrices F [D = diag{[linspace{l,2, d)]), e = 10~^). 



4.3.2 The non self-adjoint case 

In [1], it is actually proposed to use the Orthogonal Greedy Algorithm for non self adjoint 
operators. 

Consider, for the prototypical case of an advection diffusion equation: 

Find g £ H^{n) such that | « " ^5 - A5 = / in 

I g = ij on oil, 



^In practice, to avoid numerical cancellation, we actually set Fn — F — {DUn + U„D) where Un 

En r> qT 
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Figure 1: Evolution of the energy as a function of iterations (d = 10, D = 
diag{[linspace{l,2,d)]), e = 10~^). 

where a : ^ is a given smooth velocity field. When a = W for some real-valued 
function V, problem (jSlh is equivalent to minimizing the energy 



When this is not the case, it is not in general possible to recast (f5T]l in terms of a mini- 
mization problem. However, a variational formulation can be written as: Find g G Hq{^1.) 
such that, for all v € Hq{Q), 



I (a • Vg)v + Vg-Vv= [ fv. 
Jn Jn 



It is proposed in [Ij to use this variational formulation in step 1 and 2 of the Orthogonal 
Greedy Algorithm. The iterations then write: set fo = f, and at iteration n > 1, 

1. Find r„ G Hq{Qx) and s„ G i/o(r2y) such that, for all functions (r, s) G 



(a-V(r„(g)s„))(r„(g)s+r(g)s„)+V(r„(g)s„)-V(r„(g)s+r(gis„) = / /n-i(r„(g)s+r(gis„). 
n Jn 

(52) 

2. Find u„ G Vect(ri (g) si, . . . , r„ (g) s„) such that for all G Vect(ri (g) si, . . . , r„ (g) 

Sn) 

/ (a • Vun)v + Vun ■Vv= fv. (53) 
Jn Jn 
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3 . Set /„ = fn~i - (a • Vu„ - Aun) ■ 



4. If \\ fn\\ H - i(n) ^ ^ > proceed to iteration n + 1. Otherwise, stop. 



The corresponding discrete formulation reads: 



Find G G M'^'"^ such that BG + GB^ 



(54) 



where B is not supposed to be symmetric here (compare to (l47l) ). The numerical method 
reads: Set Fq = F and at iteration n > 1, 



3. If \\Fn\\ >£, proceed to iteration n+1. Otherwise stop. 

We consider the case when B = D + A with D symmetric positive definite, and A 
antisymmetric, so that we know there exists a unique solution to ([54l) . On the numer- 
ical tests we have performed, the algorithm seems to converge. In the absence of any 
energy minimization principle, it is however unclear to us how to prove convergence of this 
algorithm. 
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1. Find Rn and Sn two vectors in 



such that : 




(55) 



2 . Set Fn = Fn-l 



(BRnSl + RnSlB^) . 
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